Uploading Data
Metchosin Insect Count Data
This insect count data was prepared by Freya Innes and then Larissa
Bron manually added the beginning and end of biweekly sampling intervals
(range_begin, range_end) in Excel, resulting in the Insect Count Data
(https://github.com/larissaissabron/MetchosinInsectBiomass/blob/main/Metchosin%20Project%20Metchosin%20Data.csv).
| Variable |
Description |
| trap_number |
Malaise trap site number |
| year |
Year insects collected in trap |
| month |
Biweekly sampling interval in which insects were collected in
trap |
| range_begin |
First day of the biweekly sampling interval, “month” |
| range_end |
Last day of the biweekly sampling interval, “month” |
| collection_date |
Date the insect trap contents from the preceding biweekly collection
interval were collected |
| group |
Major flying insect orders, one of: Hymenoptera, Diptera,
Lepidoptera, Hemiptera, Coleoptera, or Other |
| biomass |
Biomass measured per insect order (“group”) |
| biomass_total |
Biomass summed for all insect orders (“groups”) per biweekly
sampling interval (“month”) |
| percent_biomass |
Contribution of each insect order (“group”) to the total biomass per
biweekly sampling interval (“month”) |
| individual_count |
Number of individual insects counted for each insect order (“group”)
per biweekly sampling interval (“month”) |
| count_total |
Individual count summed for all insect orders (“groups”) per
biweekly sampling interval (“month”) |
The insect count data was uploaded to R and prepared to streamline
the workflow. Descriptions of the insect count data follow.
# Import .csv file of insect count data
insect_counts <- read_csv("data/Metchosin Project Metchosin Data.csv") %>%
# change all column names to be lower case to assist in joining and working between datasets later
setNames(
names(.) %>%
tolower()) %>%
# convert trap_number to a factor so that it is labelled appropriately in graphics
mutate_at(c('trap_number'), as.factor)
## Rows: 6078 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): Month, Collection_Date, Group
## dbl (7): Trap_Number, Year, Biomass, Biomass_Total, Percent_Biomass, Indivi...
## date (2): Range_Begin, Range_End
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Change the format of collection date to be what r recognizes as dates
insect_counts <- insect_counts %>%
mutate(collection_date = dmy(collection_date))
# Separate collection date components to get a month for calculating monthly climate data
insect_counts <- insect_counts %>%
separate(collection_date, c('collect_year', 'collect_month', 'collect_day'), sep = "-",remove = FALSE)
# year, month, day to numeric instead of character
insect_counts$collect_year <- as.numeric(insect_counts$collect_year)
insect_counts$collect_month <- as.numeric(insect_counts$collect_month)
insect_counts$collect_day <- as.numeric(insect_counts$collect_day)
# change format from a dataframe to tibble to assist R in working with large dataset
as_tibble(insect_counts)
## # A tibble: 6,078 × 15
## trap_number year month range_begin range_end collection_date collect_year
## <fct> <dbl> <chr> <date> <date> <date> <dbl>
## 1 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 2 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 3 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 4 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 5 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 6 1 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 7 2 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 8 2 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 9 2 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## 10 2 2018 (July … 2018-07-06 2018-07-20 2018-07-20 2018
## # ℹ 6,068 more rows
## # ℹ 8 more variables: collect_month <dbl>, collect_day <dbl>, group <chr>,
## # biomass <dbl>, biomass_total <dbl>, percent_biomass <dbl>,
## # individual_count <dbl>, count_total <dbl>
# check out imported data to make sure nothing obviously wrong
summary(insect_counts)
## trap_number year month range_begin
## 14 : 414 Min. :2018 Length:6078 Min. :2018-07-06
## 9 : 408 1st Qu.:2019 Class :character 1st Qu.:2019-08-08
## 12 : 408 Median :2020 Mode :character Median :2020-09-01
## 13 : 408 Mean :2020 Mean :2020-11-28
## 4 : 402 3rd Qu.:2022 3rd Qu.:2022-05-01
## 5 : 396 Max. :2023 Max. :2023-10-02
## (Other):3642
## range_end collection_date collect_year collect_month
## Min. :2018-07-20 Min. :2018-07-20 Min. :2018 Min. : 5.000
## 1st Qu.:2019-08-21 1st Qu.:2019-08-21 1st Qu.:2019 1st Qu.: 6.000
## Median :2020-09-14 Median :2020-09-14 Median :2020 Median : 8.000
## Mean :2020-12-10 Mean :2020-12-11 Mean :2020 Mean : 7.694
## 3rd Qu.:2022-05-14 3rd Qu.:2022-05-14 3rd Qu.:2022 3rd Qu.: 9.000
## Max. :2023-10-15 Max. :2023-10-15 Max. :2023 Max. :10.000
##
## collect_day group biomass biomass_total
## Min. : 1.00 Length:6078 Min. : 0.000 Min. : 0.000
## 1st Qu.: 9.00 Class :character 1st Qu.: 0.130 1st Qu.: 1.610
## Median :16.00 Mode :character Median : 0.380 Median : 3.830
## Mean :16.19 Mean : 1.004 Mean : 6.026
## 3rd Qu.:24.00 3rd Qu.: 1.050 3rd Qu.: 8.150
## Max. :31.00 Max. :35.820 Max. :43.460
##
## percent_biomass individual_count count_total
## Min. : 0.00 Min. : 0.0 Min. : 0
## 1st Qu.: 4.55 1st Qu.: 32.0 1st Qu.: 659
## Median :11.85 Median : 99.0 Median :1258
## Mean :16.67 Mean : 281.9 Mean :1690
## 3rd Qu.:24.94 3rd Qu.: 288.0 3rd Qu.:2173
## Max. :93.78 Max. :8615.0 Max. :9827
## NA's :18
str(insect_counts)
## tibble [6,078 × 15] (S3: tbl_df/tbl/data.frame)
## $ trap_number : Factor w/ 20 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
## $ year : num [1:6078] 2018 2018 2018 2018 2018 ...
## $ month : chr [1:6078] "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" ...
## $ range_begin : Date[1:6078], format: "2018-07-06" "2018-07-06" ...
## $ range_end : Date[1:6078], format: "2018-07-20" "2018-07-20" ...
## $ collection_date : Date[1:6078], format: "2018-07-20" "2018-07-20" ...
## $ collect_year : num [1:6078] 2018 2018 2018 2018 2018 ...
## $ collect_month : num [1:6078] 7 7 7 7 7 7 7 7 7 7 ...
## $ collect_day : num [1:6078] 20 20 20 20 20 20 20 20 20 20 ...
## $ group : chr [1:6078] "Hymenoptera" "Diptera" "Lepidoptera" "Hemiptera" ...
## $ biomass : num [1:6078] 1.18 1.83 5.48 0.06 0.56 0.04 2.69 1.76 4.77 0.2 ...
## $ biomass_total : num [1:6078] 9.15 9.15 9.15 9.15 9.15 ...
## $ percent_biomass : num [1:6078] 12.9 20.01 59.87 0.7 6.13 ...
## $ individual_count: num [1:6078] 127 476 248 58 40 97 257 244 269 80 ...
## $ count_total : num [1:6078] 1046 1046 1046 1046 1046 ...
Visualization to confirm that both the biomass and individual counts
are following expected trends. Trends should include: all orders
represented, as well as each trap and year having a similar range of
values.
# plot trap number vs. biomass
ggplot(
# data into the ggplot
data = insect_counts,
mapping = aes(x = trap_number, y = biomass)
) +
# scatterplot
geom_point() +
# label
labs(
x = "Trap Number",
y = "Biomass (g)",
color = "Order",
title = "Biomass as a Function of Trap Number Compared Between Insect Orders"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# present in a grid
facet_wrap(group ~ ., scales = 'free', ncol=2)

# plot trap number vs. individual count
ggplot(
# data into the ggplot
data = insect_counts,
mapping = aes(x = trap_number, y = individual_count)
) +
# scatterplot
geom_point() +
# label
labs(
x = "Trap Number",
y = "Individual Count",
color = "Order",
title = "Individual Count as a Function of Trap Number Compared Between Insect Orders"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# present in a grid
facet_wrap(group ~ ., scales = 'free', ncol=2)

# plot year vs. biomass
ggplot(
# data into the plot
data = insect_counts,
mapping = aes(x = year, y = biomass)
) +
# scatter plot
geom_point() +
# label
labs(
x = "Year",
y = "Biomass",
color = "Order",
title = "Biomass as a Function of Year Compared Between Insect Orders"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# present in a grid
facet_wrap(group ~ ., scales = 'free', ncol=2)

# plot year vs. individual count
ggplot(
# data into plot
data = insect_counts,
mapping = aes(x = year, y = individual_count)
) +
# scatterplot
geom_point() +
# label
labs(
x = "Year",
y = "Individual Count",
color = "Order",
title = "Individual Count as a Function of Year Compared Between Insect Orders"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# present in a grid
facet_wrap(group ~ ., scales = 'free', ncol=2)

Climate Data
The Pacific Climate Impacts Consortium British Columbia Station Data
Explorer (https://services.pacificclimate.org/met-data-portal-pcds/app/)
was used to locate a relevant Environment Canada weather station for the
District of Metchosin.

Figure: The location of three relevant Environment Canada weather
stations within and near the District of Metchosin.
The Metchosin and William Head weather stations within Metchosin
boundaries were missing values of interest, either weather variables or
time series, and as such were excluded. The Esquimalt Harbour (Station
ID 1012710) was selected as the nearest location.
Daily weather data was downloaded from Environment Canada (https://climate-change.canada.ca/climate-data/#/daily-climate-data)
for the Esquimalt Harbour weather station between January 1, 2017 and
December 31, 2023. This data was then manually prepared in Excel by
removing all irrelevant data, resulting in the Climate Data file (https://github.com/larissaissabron/MetchosinInsectBiomass/blob/main/EsquimaltHarbour_ClimateData.csv).
| id |
n/a |
Trap number, year, month, day (format of
trapnumber.year.month.day) |
| min_temperature |
°C |
Minimum daily temperature |
| max_temperature |
°C |
Maximum daily temperature |
| mean_temperature |
°C |
Mean daily temperature |
| total_precipitation |
mm |
Total daily precipitation |
| speed_max_gust |
km/h |
Speed of the daily maximum wind gust |
The climate data was uploaded to R and prepared for analysis.
# Import .csv file of climate data.
climate <- read_csv("data/EsquimaltHarbour_ClimateData.csv") %>%
# change all column names to be lower case to assist in joining and working between datasets
setNames(
names(.) %>%
tolower()) %>%
# rename columns for clarity in future aggregations of climate data by distinguishing this as daily data.
rename(
daily_min_temp = min_temperature,
daily_max_temp = max_temperature,
daily_mean_temp = mean_temperature,
daily_tot_precip = total_precipitation,
daily_max_gust = speed_max_gust
)
## Rows: 2500 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ID
## dbl (5): MIN_TEMPERATURE, MAX_TEMPERATURE, MEAN_TEMPERATURE, TOTAL_PRECIPITA...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# change format from dataframe to tibble to assist R in working with large dataset
as_tibble(climate)
## # A tibble: 2,500 × 6
## id daily_min_temp daily_max_temp daily_mean_temp daily_tot_precip
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1012710.2017.… 0 4.9 2.5 NA
## 2 1012710.2017.… -0.4 2.2 0.9 NA
## 3 1012710.2017.… -1.7 0.2 -0.8 NA
## 4 1012710.2017.… -1.1 4 1.5 NA
## 5 1012710.2017.… -2.5 3.2 0.4 NA
## 6 1012710.2017.… -0.7 4.6 2 NA
## 7 1012710.2017.… 1.1 4.9 3 0
## 8 1012710.2017.… 1.6 5.4 3.5 1.8
## 9 1012710.2017.… 3.4 7.1 5.3 5.6
## 10 1012710.2017.… -0.4 5 2.3 NA
## # ℹ 2,490 more rows
## # ℹ 1 more variable: daily_max_gust <dbl>
# Check out imported data to make sure nothing obviously wrong
summary(climate)
## id daily_min_temp daily_max_temp daily_mean_temp
## Length:2500 Min. :-8.100 Min. :-4.20 Min. :-6.10
## Class :character 1st Qu.: 4.300 1st Qu.: 9.00 1st Qu.: 6.70
## Mode :character Median : 7.600 Median :12.90 Median :10.20
## Mean : 7.609 Mean :13.47 Mean :10.54
## 3rd Qu.:11.400 3rd Qu.:17.80 3rd Qu.:14.60
## Max. :16.400 Max. :35.90 Max. :25.20
## NA's :27 NA's :33 NA's :33
## daily_tot_precip daily_max_gust
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.:33.00
## Median : 0.00 Median :39.00
## Mean : 1.92 Mean :36.68
## 3rd Qu.: 1.20 3rd Qu.:46.00
## Max. :60.80 Max. :94.00
## NA's :137 NA's :1098
str(climate)
## spc_tbl_ [2,500 × 6] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ id : chr [1:2500] "1012710.2017.1.1" "1012710.2017.1.2" "1012710.2017.1.3" "1012710.2017.1.4" ...
## $ daily_min_temp : num [1:2500] 0 -0.4 -1.7 -1.1 -2.5 -0.7 1.1 1.6 3.4 -0.4 ...
## $ daily_max_temp : num [1:2500] 4.9 2.2 0.2 4 3.2 4.6 4.9 5.4 7.1 5 ...
## $ daily_mean_temp : num [1:2500] 2.5 0.9 -0.8 1.5 0.4 2 3 3.5 5.3 2.3 ...
## $ daily_tot_precip: num [1:2500] NA NA NA NA NA NA 0 1.8 5.6 NA ...
## $ daily_max_gust : num [1:2500] 57 61 57 56 0 0 0 43 61 74 ...
## - attr(*, "spec")=
## .. cols(
## .. ID = col_character(),
## .. MIN_TEMPERATURE = col_double(),
## .. MAX_TEMPERATURE = col_double(),
## .. MEAN_TEMPERATURE = col_double(),
## .. TOTAL_PRECIPITATION = col_double(),
## .. SPEED_MAX_GUST = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
Climate data was further manipulated to streamline the workflow for
aggregating climate variables of interest in later steps.
# Updating the climate dataframe.
climate <- climate %>%
# separate ID into it's component parts
separate(id, c('station_id', 'year', 'month', 'day')) %>%
# remove original ID column
select(-1) %>%
# get date into a format that is easier to work with in R
mutate(date = make_date(year, month, day)) %>%
# make date the first column
relocate(date) %>%
# remove working columns for year, month, and date
select(-2, -3, -4)
# find missing dates and add to dataframe with NAs (ChatGPT support to find out how to look for and fill in missing dates)
# look for missing dates. Generate a full sequence of dates from the minimum to maximum in the dataset
full_dates <- data.frame(date = seq(min(climate$date), max(climate$date), by = "day"))
# Identify missing dates by performing an anti_join
missing_dates <- anti_join(full_dates, climate, by = "date")
# Add missing dates to climate dataframe
climate <- climate %>%
right_join(full_dates, by = "date")
# Re-update the climate dataframe by separating out the year, month, and day
climate <- climate %>%
# separate ID into it's component parts
separate(date, c('year', 'month', 'day')) %>%
# get date into a format that is easier to work with in R
mutate(date = make_date(year, month, day)) %>%
# make date the first column
relocate(date)
# View updated climate data
print(climate)
## # A tibble: 2,556 × 9
## date year month day daily_min_temp daily_max_temp daily_mean_temp
## <date> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 2017-01-01 2017 01 01 0 4.9 2.5
## 2 2017-01-02 2017 01 02 -0.4 2.2 0.9
## 3 2017-01-03 2017 01 03 -1.7 0.2 -0.8
## 4 2017-01-04 2017 01 04 -1.1 4 1.5
## 5 2017-01-05 2017 01 05 -2.5 3.2 0.4
## 6 2017-01-06 2017 01 06 -0.7 4.6 2
## 7 2017-01-07 2017 01 07 1.1 4.9 3
## 8 2017-01-08 2017 01 08 1.6 5.4 3.5
## 9 2017-01-09 2017 01 09 3.4 7.1 5.3
## 10 2017-01-10 2017 01 10 -0.4 5 2.3
## # ℹ 2,546 more rows
## # ℹ 2 more variables: daily_tot_precip <dbl>, daily_max_gust <dbl>
# Confirm that nothing weird since updates to climate data
summary(climate)
## date year month day
## Min. :2017-01-01 Length:2556 Length:2556 Length:2556
## 1st Qu.:2018-10-01 Class :character Class :character Class :character
## Median :2020-07-01 Mode :character Mode :character Mode :character
## Mean :2020-07-01
## 3rd Qu.:2022-04-01
## Max. :2023-12-31
##
## daily_min_temp daily_max_temp daily_mean_temp daily_tot_precip
## Min. :-8.100 Min. :-4.20 Min. :-6.10 Min. : 0.00
## 1st Qu.: 4.300 1st Qu.: 9.00 1st Qu.: 6.70 1st Qu.: 0.00
## Median : 7.600 Median :12.90 Median :10.20 Median : 0.00
## Mean : 7.609 Mean :13.47 Mean :10.54 Mean : 1.92
## 3rd Qu.:11.400 3rd Qu.:17.80 3rd Qu.:14.60 3rd Qu.: 1.20
## Max. :16.400 Max. :35.90 Max. :25.20 Max. :60.80
## NA's :83 NA's :89 NA's :89 NA's :193
## daily_max_gust
## Min. : 0.00
## 1st Qu.:33.00
## Median :39.00
## Mean :36.68
## 3rd Qu.:46.00
## Max. :94.00
## NA's :1154
str(climate)
## tibble [2,556 × 9] (S3: tbl_df/tbl/data.frame)
## $ date : Date[1:2556], format: "2017-01-01" "2017-01-02" ...
## $ year : chr [1:2556] "2017" "2017" "2017" "2017" ...
## $ month : chr [1:2556] "01" "01" "01" "01" ...
## $ day : chr [1:2556] "01" "02" "03" "04" ...
## $ daily_min_temp : num [1:2556] 0 -0.4 -1.7 -1.1 -2.5 -0.7 1.1 1.6 3.4 -0.4 ...
## $ daily_max_temp : num [1:2556] 4.9 2.2 0.2 4 3.2 4.6 4.9 5.4 7.1 5 ...
## $ daily_mean_temp : num [1:2556] 2.5 0.9 -0.8 1.5 0.4 2 3 3.5 5.3 2.3 ...
## $ daily_tot_precip: num [1:2556] NA NA NA NA NA NA 0 1.8 5.6 NA ...
## $ daily_max_gust : num [1:2556] 57 61 57 56 0 0 0 43 61 74 ...
Visualization to confirm that daily recorded values make sense when
compared across years.
# daily mean temperature
ggplot(
# add data
data = climate,
mapping = aes(x = month, y = daily_mean_temp, na.rm = TRUE)) +
# scatterplot
geom_point() +
# label
labs(
x = "Month",
y = "Daily Mean Temperature (°C)",
title = "Daily Mean Temperature Summarized by Month (2017-2023)"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb) +
# present in a grid
facet_wrap(year ~ ., scales = 'free', ncol=4)

# daily minimum temperature
ggplot(
# add data
data = climate,
mapping = aes(x = month, y = daily_min_temp, na.rm = TRUE)) +
# scatterplot
geom_point() +
# label
labs(
x = "Month",
y = "Daily Minimum Temperature (°C)",
title = "Daily Minimum Temperature Summarized by Month (2017-2023)"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb) +
# present in a grid
facet_wrap(year ~ ., scales = 'free', ncol=4)

# daily maximum temperature
ggplot(
# add data
data = climate,
mapping = aes(x = month, y = daily_max_temp, na.rm = TRUE)) +
# scatterplot
geom_point() +
# label
labs(
x = "Month",
y = "Daily Maximum Temperature (°C)",
title = "Daily Maximum Temperature Summarized by Month (2017-2023)"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb) +
# present in a grid
facet_wrap(year ~ ., scales = 'free', ncol=4)

# daily total precipitation
ggplot(
# add data
data = climate,
mapping = aes(x = month, y = daily_tot_precip, na.rm = TRUE)) +
# scatterplot
geom_point() +
# label
labs(
x = "Month",
y = "Daily Total Precipitation (mm)",
title = "Daily Total Precipitation Summarized by Month (2017-2023)"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb) +
# present in a grid
facet_wrap(year ~ ., scales = 'free', ncol=4)

# daily maximum wind gust
ggplot(
# add data
data = climate,
mapping = aes(x = month, y = daily_max_gust, na.rm = TRUE)) +
# scatterplot
geom_point() +
# label
labs(
x = "Month",
y = "Daily Maximum Wind Gust (km/hr)",
title = "Daily Maximum Wind Gust Summarized by Month (2017-2023)"
) +
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb) +
# present in a grid
facet_wrap(year ~ ., scales = 'free', ncol=4)

Insect Trap Location Data
The insect trap location data file was created by Annika Meije and
then Larissa Bron converted the location coordinates to degrees,
minutes, seconds in QGis, resulting in Traps Data file (https://github.com/larissaissabron/MetchosinInsectBiomass/blob/main/Traps.csv).

Figure: The location of the Metchosin Insect Biomass Project within
the spatial context Vancouver Island, the Salish Sea, and the west coast
of British Columbia and Washington.

Figure: The 20 insect trap locations spread throughout the District
of Metchosin.
The insect trap location data was loaded into R.
# Import .csv file of trap location data.
traps <- read.csv("data/Traps.csv", fileEncoding = "UTF-8")
# change format from dataframe to tibble to assist R in working with large dataset
as_tibble(traps)
## # A tibble: 20 × 3
## trap_number latitude longitude
## <int> <chr> <chr>
## 1 1 48°24′19.8701″ -123°33′21.3825″
## 2 2 48°23′47.2428″ -123°34′6.3267″
## 3 3 48°23′29.5436″ -123°34′34.1539″
## 4 4 48°22′46.7819″ -123°35′9.5595″
## 5 5 48°23′3.5567″ -123°33′16.2336″
## 6 6 48°22′11.3156″ -123°32′1.6113″
## 7 7 48°22′27.3518″ -123°32′17.4123″
## 8 8 48°22′16.1411″ -123°33′25.1513″
## 9 9 48°21′31.5438″ -123°34′35.4979″
## 10 10 48°22′22.8000″ -123°32′5.9999″
## 11 11 48°23′11.7224″ -123°32′30.0531″
## 12 12 48°21′16.8023″ -123°33′25.8695″
## 13 13 48°23′28.5940″ -123°32′57.0983″
## 14 14 48°22′21.3059″ -123°33′57.7102″
## 15 15 48°22′50.4664″ -123°31′26.8201″
## 16 16 48°24′59.9873″ -123°38′13.5073″
## 17 17 48°21′14.8168″ -123°32′48.1251″
## 18 18 48°22′14.4895″ -123°31′52.2609″
## 19 19 48°23′23.4576″ -123°30′14.2268″
## 20 20 48°22′45.8311″ -123°33′44.5944″
# change trap number to be a factor
traps$trap_number <- as.factor(traps$trap_number)
# Check out imported data to make sure nothing obviously wrong
summary(traps)
## trap_number latitude longitude
## 1 : 1 Length:20 Length:20
## 2 : 1 Class :character Class :character
## 3 : 1 Mode :character Mode :character
## 4 : 1
## 5 : 1
## 6 : 1
## (Other):14
str(traps)
## 'data.frame': 20 obs. of 3 variables:
## $ trap_number: Factor w/ 20 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ latitude : chr "48°24′19.8701″" "48°23′47.2428″" "48°23′29.5436″" "48°22′46.7819″" ...
## $ longitude : chr "-123°33′21.3825″" "-123°34′6.3267″" "-123°34′34.1539″" "-123°35′9.5595″" ...
Land Cover Data
The Traps.csv file was uploaded to QGis along with the Capital
Regional District 1m Resolution Land Cover Raster Data (https://www.crd.bc.ca/about/data/geospatial-data) to
extract land cover at buffers of 50, 100, 250, and 500 metres around
each insect trap
Steps:
Mapping projection set as WGS 84/UTM Zone 10N to allow distance
calculation using meters (instead of degrees).
The symbology of the land cover data was updated in QGis to
reflect the labels provided in the Land Cover Data Report (https://maps.crd.bc.ca/Media/Reports/Land%20Cover%202017-2019%20LiDAR%20Enhanced%20Version.zip).

The land cover data layer was clipped near the Metchosin boundary
to limit rendering and processing power required to work with the
file.

Circles with diameter 50, 100, 250, and 500 metres, “buffers,”
were created around each insect trap using the Buffer tool in QGis.

For each buffer, the Raster Calculator tool in QGis was used to
calculate the number of pixels of each land cover category contained
within each buffer, the total number of pixels within each buffer, and
to aggregate separately the anthropogenic and natural features.
The resulting .csv file for each buffer was then exported from
QGis and imported into R.
The land cover data extracted at each buffer size in meters is
available:
# Import .csv files for land cover data extracted at 50, 100, 250, and 500 metre buffers
# 50 meter buffer around each insect trap
landcover_50m <- read.csv("data/ZonalHistogram_50m.csv") %>%
# remove lat and long because redundant
select(-2, -3) %>%
# rename land cover columns so they are easier to work with
rename(
trap_number = Id,
exposed_soil = HISTO_7,
grass = HISTO_8,
herb = HISTO_9,
shrub_small_tree = HISTO_10,
deciduous_tree = HISTO_11,
coniferous_tree = HISTO_12,
pavement_packed_gravel = HISTO_14,
road = HISTO_15,
buildings = HISTO_16,
agriculture = HISTO_17,
exposed_bedrock = HISTO_18
) %>%
# add a column for the size of the buffer
mutate(buffer = 50) %>%
# total number of pixels (each pixel is 1m x 1m, so it is equivalent to area)
mutate(total_area = exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + pavement_packed_gravel + road + buildings + agriculture + exposed_bedrock) %>%
# total natural
mutate(natural = exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + exposed_bedrock) %>%
# total anthropogenic
mutate(anthropogenic = pavement_packed_gravel + road + buildings + agriculture)
# 100 meter buffer
landcover_100m <- read.csv("data/ZonalHistogram_100m.csv") %>%
# remove lat and long because redundant
select(-2, -3) %>%
# rename landcover columns so they are easier to work with
rename(
trap_number = Id,
ocean = HISTO_1,
lake = HISTO_2,
exposed_soil = HISTO_7,
grass = HISTO_8,
herb = HISTO_9,
shrub_small_tree = HISTO_10,
deciduous_tree = HISTO_11,
coniferous_tree = HISTO_12,
pavement_packed_gravel = HISTO_14,
road = HISTO_15,
buildings = HISTO_16,
agriculture = HISTO_17,
exposed_bedrock = HISTO_18,
aq_vegetation = HISTO_22) %>%
# add a column for the size of the buffer
mutate(buffer = 100) %>%
# total number of pixels (each pixel is 1m x 1m, so it is equivalent to area)
mutate(total_area = ocean + lake + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + pavement_packed_gravel + road + buildings + agriculture + exposed_bedrock + aq_vegetation) %>%
# total natural
mutate(natural = ocean + lake + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + exposed_bedrock + aq_vegetation) %>%
# total anthropogenic
mutate(anthropogenic = pavement_packed_gravel + road + buildings + agriculture)
# 250 meter buffer
landcover_250m <- read.csv("data/ZonalHistogram_250m.csv") %>%
# remove lat and long because redundant
select(-2, -3) %>%
# rename landcover columns so they are easier to work with
rename(
trap_number = Id,
ocean = HISTO_1,
lake = HISTO_2,
sand_gravel_shore = HISTO_5,
bedrock_shore = HISTO_6,
exposed_soil = HISTO_7,
grass = HISTO_8,
herb = HISTO_9,
shrub_small_tree = HISTO_10,
deciduous_tree = HISTO_11,
coniferous_tree = HISTO_12,
docks = HISTO_13,
pavement_packed_gravel = HISTO_14,
road = HISTO_15,
buildings = HISTO_16,
agriculture = HISTO_17,
exposed_bedrock = HISTO_18,
aq_vegetation = HISTO_22) %>%
# add a column for the size of the buffer
mutate(buffer = 250) %>%
# total number of pixels (each pixel is 1m x 1m, so it is equivalent to area)
mutate(total_area = ocean + lake + sand_gravel_shore + bedrock_shore + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + docks + pavement_packed_gravel + road + buildings + agriculture + exposed_bedrock + aq_vegetation) %>%
# total natural
mutate(natural = ocean + lake + sand_gravel_shore + bedrock_shore + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + exposed_bedrock + aq_vegetation) %>%
# total anthropogenic
mutate(anthropogenic = docks + pavement_packed_gravel + road + buildings + agriculture)
# 500 meter buffer
landcover_500m <- read.csv("data/ZonalHistogram_500m.csv") %>%
# remove lat and long because redundant
select(-2, -3) %>%
# rename landcover columns so they are easier to work with
rename(
trap_number = Id,
ocean = HISTO_1,
lake = HISTO_2,
sand_gravel_shore = HISTO_5,
bedrock_shore = HISTO_6,
exposed_soil = HISTO_7,
grass = HISTO_8,
herb = HISTO_9,
shrub_small_tree = HISTO_10,
deciduous_tree = HISTO_11,
coniferous_tree = HISTO_12,
docks = HISTO_13,
pavement_packed_gravel = HISTO_14,
road = HISTO_15,
buildings = HISTO_16,
agriculture = HISTO_17,
exposed_bedrock = HISTO_18,
aq_vegetation = HISTO_22,
loose_gravel = HISTO_24) %>%
# add a column for the size of the buffer
mutate(buffer = 500) %>%
# total number of pixels (each pixel is 1m x 1m, so it is equivalent to area)
mutate(total_area = ocean + lake + sand_gravel_shore + bedrock_shore + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + docks + pavement_packed_gravel + road + buildings + agriculture + exposed_bedrock + aq_vegetation + loose_gravel) %>%
# total natural
mutate(natural = ocean + lake + sand_gravel_shore + bedrock_shore + exposed_soil + grass + herb + shrub_small_tree + deciduous_tree + coniferous_tree + exposed_bedrock + aq_vegetation + loose_gravel) %>%
# total anthropogenic
mutate(anthropogenic = docks + pavement_packed_gravel + road + buildings + agriculture)
# join together the landcover data
landcover <- bind_rows(landcover_50m, landcover_100m, landcover_250m, landcover_500m)
# adjust the landcover data
landcover <- landcover %>%
# Make all NAs to 0s
replace(is.na(landcover), 0) %>%
relocate(buffer, .after = trap_number) %>%
relocate(total_area, .after = buffer)
# update trap number to be a factor
landcover$trap_number <- as.factor(landcover$trap_number)
# Check out imported data to make sure nothing obviously wrong
summary(landcover)
## trap_number buffer total_area exposed_soil
## 1 : 4 Min. : 50.0 Min. : 7845 Min. : 0.00
## 2 : 4 1st Qu.: 87.5 1st Qu.: 25500 1st Qu.: 83.75
## 3 : 4 Median :175.0 Median :113800 Median : 788.50
## 4 : 4 Mean :225.0 Mean :255088 Mean : 3236.19
## 5 : 4 3rd Qu.:312.5 3rd Qu.:343394 3rd Qu.: 3621.00
## 6 : 4 Max. :500.0 Max. :784915 Max. :25191.00
## (Other):56
## grass herb shrub_small_tree deciduous_tree
## Min. : 0 Min. : 0.0 Min. : 43 Min. : 804
## 1st Qu.: 897 1st Qu.: 563.5 1st Qu.: 1040 1st Qu.: 3551
## Median : 5440 Median : 3316.5 Median : 4298 Median : 18979
## Mean : 19933 Mean : 9525.3 Mean :15420 Mean : 47954
## 3rd Qu.: 28025 3rd Qu.:14470.8 3rd Qu.:21172 3rd Qu.: 71200
## Max. :146646 Max. :46590.0 Max. :70352 Max. :230305
##
## coniferous_tree pavement_packed_gravel road buildings
## Min. : 1154 Min. : 0.0 Min. : 0 Min. : 0.0
## 1st Qu.: 5870 1st Qu.: 179.8 1st Qu.: 0 1st Qu.: 113.8
## Median : 26842 Median : 750.0 Median : 1368 Median : 651.5
## Mean :106498 Mean : 3697.6 Mean : 3427 Mean : 3245.7
## 3rd Qu.:131309 3rd Qu.: 5273.0 3rd Qu.: 4674 3rd Qu.: 3595.0
## Max. :549144 Max. :34899.0 Max. :25632 Max. :37996.0
##
## agriculture exposed_bedrock natural anthropogenic
## Min. : 0 Min. : 0.0 Min. : 3020 Min. : 0
## 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 13826 1st Qu.: 1096
## Median : 0 Median : 0.0 Median : 63030 Median : 7594
## Mean : 29931 Mean : 431.3 Mean :214785 Mean : 40303
## 3rd Qu.: 11856 3rd Qu.: 80.0 3rd Qu.:246791 3rd Qu.: 32582
## Max. :345214 Max. :22785.0 Max. :777399 Max. :383525
##
## ocean lake aq_vegetation sand_gravel_shore
## Min. : 0 Min. : 0.0 Min. : 0.00 Min. : 0
## 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.: 0
## Median : 0 Median : 0.0 Median : 0.00 Median : 0
## Mean : 8600 Mean : 1532.1 Mean : 43.50 Mean : 1264
## 3rd Qu.: 0 3rd Qu.: 900.8 3rd Qu.: 0.25 3rd Qu.: 0
## Max. :218199 Max. :28164.0 Max. :906.00 Max. :26665
##
## bedrock_shore docks loose_gravel
## Min. : 0.0 Min. : 0.000 Min. : 0.00
## 1st Qu.: 0.0 1st Qu.: 0.000 1st Qu.: 0.00
## Median : 0.0 Median : 0.000 Median : 0.00
## Mean : 334.2 Mean : 1.837 Mean : 12.65
## 3rd Qu.: 0.0 3rd Qu.: 0.000 3rd Qu.: 0.00
## Max. :15989.0 Max. :85.000 Max. :1012.00
##
str(landcover)
## 'data.frame': 80 obs. of 23 variables:
## $ trap_number : Factor w/ 20 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ buffer : num 50 50 50 50 50 50 50 50 50 50 ...
## $ total_area : int 7851 7845 7851 7849 7854 7852 7846 7850 7855 7845 ...
## $ exposed_soil : int 10 125 140 5 7 0 3 37 194 22 ...
## $ grass : int 148 0 127 379 209 619 2411 472 1622 609 ...
## $ herb : int 70 102 647 45 177 319 580 514 258 422 ...
## $ shrub_small_tree : int 190 580 1052 359 253 380 604 839 926 645 ...
## $ deciduous_tree : int 1505 1323 1308 804 2190 1203 1829 1297 1039 1082 ...
## $ coniferous_tree : int 5928 5697 3857 6257 5018 3029 2121 4442 2720 3437 ...
## $ pavement_packed_gravel: int 0 9 130 0 0 0 15 25 207 185 ...
## $ road : int 0 0 524 0 0 0 0 0 680 365 ...
## $ buildings : int 0 0 9 0 0 0 178 217 209 0 ...
## $ agriculture : int 0 0 0 0 0 2302 105 0 0 1078 ...
## $ exposed_bedrock : int 0 9 57 0 0 0 0 7 0 0 ...
## $ natural : int 7851 7836 7188 7849 7854 5550 7548 7608 6759 6217 ...
## $ anthropogenic : int 0 9 663 0 0 2302 298 242 1096 1628 ...
## $ ocean : num 0 0 0 0 0 0 0 0 0 0 ...
## $ lake : num 0 0 0 0 0 0 0 0 0 0 ...
## $ aq_vegetation : num 0 0 0 0 0 0 0 0 0 0 ...
## $ sand_gravel_shore : num 0 0 0 0 0 0 0 0 0 0 ...
## $ bedrock_shore : num 0 0 0 0 0 0 0 0 0 0 ...
## $ docks : num 0 0 0 0 0 0 0 0 0 0 ...
## $ loose_gravel : num 0 0 0 0 0 0 0 0 0 0 ...
Visualization to confirm that the land cover data looks like it is
spread across the insect trap sites at all four buffer sizes.
# Create custom colour pallet for land cover types
landcover_colors <- c(
"exposed_soil" = "#6e554a",
"grass" = "#5c8a0e",
"herb" = "#aae651",
"shrub_small_tree" = "#539f6b",
"deciduous_tree" = "#b6b639",
"coniferous_tree" = "#0d3e05",
"pavement_packed_gravel" = "#7d9693",
"road" = "#565860",
"buildings" = "#3e392e",
"agriculture" = "#df974d",
"exposed_bedrock" = "#b0b2b2",
"ocean" = "#2b37df",
"lake" = "#1899c8",
"aq_vegetation" = "#81c298",
"sand_gravel_shore" = "#dddd82",
"bedrock_shore" = "#60657b",
"docks" = "#1a1e1b",
"loose_gravel" = "#646462"
)
# Create custom names for land cover labelling
landcover_names <- c(
"exposed_soil" = "Exposed Soil",
"grass" = "Grass",
"herb" = "Herb",
"shrub_small_tree" = "Shrubs",
"deciduous_tree" = "Deciduous Tree",
"coniferous_tree" = "Coniferous Tree",
"pavement_packed_gravel" = "Pavement",
"road" = "Road",
"buildings" = "Buildings",
"agriculture" = "Agriculture",
"exposed_bedrock" = "Exposed Bedrock",
"ocean" = "Ocean",
"lake" = "Lake",
"aq_vegetation" = "Aquatic Vegetation",
"sand_gravel_shore" = "Sand Shoreline",
"bedrock_shore" = "Bedrock Shoreline",
"docks" = "Docks",
"loose_gravel" = "Loose Gravel"
)
# 50m buffer
# Create a visualization dataframe for 50m buffer
landcover_50_vis <- landcover %>%
# select landcover extracted with a 50m buffer
filter(buffer == 50) %>%
# remove buffer size, total number of pixels, anthropogenic aggregation, natural aggregation
select(-2, -3, -15, -16) %>%
# exchange x and y values
pivot_longer(
cols = -trap_number,
names_to = "landcover",
values_to = "meters"
) %>%
# change to factors so they are visualized appropriately
mutate_at(c('landcover'), as.factor) %>%
mutate_at(c('trap_number'), as.factor)
# Plot the coverage of land cover types in a 50m buffer around each trap
ggplot(
# Add data
landcover_50_vis, aes(x = trap_number, y = meters, fill = landcover)) +
# bar plot
geom_bar(stat = "identity", position = "stack") +
# custom colours and names for legend based on landcover values
scale_fill_manual(values = landcover_colors,
labels = landcover_names) +
# label
labs(
x = "Trap Number",
y = "Area (m<sup>2</sup>)",
fill = "Land Cover",
title = "Land Cover Within a 50 Metre Buffer of Each Insect Trap") +
ylab(bquote('Area'~(m^2))) +
# move legend to the bottom and make it less awful
theme(
legend.position = "bottom",
legend.justification = "left",
legend.text = element_text(size = 8),
legend.title = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(fill=guide_legend(ncol=5))

# 100m buffer
# Create a visualization dataframe for 100m buffer
landcover_100_vis <- landcover %>%
# select landcover extracted with a 100m buffer
filter(buffer == 100) %>%
# remove buffer size, total number of pixels, anthropogenic aggregation, natural aggregation
select(-2, -3, -15, -16) %>%
# exchange x and y values
pivot_longer(
cols = -trap_number,
names_to = "landcover",
values_to = "meters"
) %>%
# change to factors so they are visualized appropriately
mutate_at(c('landcover'), as.factor) %>%
mutate_at(c('trap_number'), as.factor)
# Plot the coverage of land cover types in a 100m buffer around each trap
ggplot(
# Add data
landcover_100_vis, aes(x = trap_number, y = meters, fill = landcover)) +
# bar plot
geom_bar(stat = "identity", position = "stack") +
# custom colours and names for legend based on landcover values
scale_fill_manual(values = landcover_colors,
labels = landcover_names) +
# label
labs(
x = "Trap Number",
y = "Area (m<sup>2</sup>)",
fill = "Land Cover",
title = "Land Cover Within a 100 Metre Buffer of Each Insect Trap") +
ylab(bquote('Area'~(m^2))) +
# move legend to the bottom and make it less awful
theme(
legend.position = "bottom",
legend.justification = "left",
legend.text = element_text(size = 8),
legend.title = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(fill=guide_legend(ncol=5))

# 250m buffer
# Create a visualization dataframe for 250m buffer
landcover_250_vis <- landcover %>%
# select landcover extracted with a 250m buffer
filter(buffer == 250) %>%
# remove buffer size, total number of pixels, anthropogenic aggregation, natural aggregation
select(-2, -3, -15, -16) %>%
# exchange x and y values
pivot_longer(
cols = -trap_number,
names_to = "landcover",
values_to = "meters"
) %>%
# change to factors so they are visualized appropriately
mutate_at(c('landcover'), as.factor) %>%
mutate_at(c('trap_number'), as.factor)
# Plot the coverage of land cover types in a 250m buffer around each trap
ggplot(
# Add data
landcover_250_vis, aes(x = trap_number, y = meters, fill = landcover)) +
# bar plot
geom_bar(stat = "identity", position = "stack") +
# custom colours and names for legend based on landcover values
scale_fill_manual(values = landcover_colors,
labels = landcover_names) +
# label
labs(
x = "Trap Number",
y = "Area (m<sup>2</sup>)",
fill = "Land Cover",
title = "Land Cover Within a 250 Metre Buffer of Each Insect Trap") +
ylab(bquote('Area'~(m^2))) +
# move legend to the bottom and make it less awful
theme(
legend.position = "bottom",
legend.justification = "left",
legend.text = element_text(size = 8),
legend.title = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(fill=guide_legend(ncol=5))

# 500m buffer
# Create a visualization dataframe for 500m buffer
landcover_500_vis <- landcover %>%
# select landcover extracted with a 500m buffer
filter(buffer == 500) %>%
# remove buffer size, total number of pixels, anthropogenic aggregation, natural aggregation
select(-2, -3, -15, -16) %>%
# exchange x and y values
pivot_longer(
cols = -trap_number,
names_to = "landcover",
values_to = "meters"
) %>%
# change to factors so they are visualized appropriately
mutate_at(c('landcover'), as.factor) %>%
mutate_at(c('trap_number'), as.factor)
# Plot the coverage of land cover types in a 500m buffer around each trap
ggplot(
# Add data
landcover_500_vis, aes(x = trap_number, y = meters, fill = landcover)) +
# bar plot
geom_bar(stat = "identity", position = "stack") +
# custom colours and names for legend based on landcover values
scale_fill_manual(values = landcover_colors,
labels = landcover_names) +
# label
labs(
x = "Trap Number",
y = "Area (m<sup>2</sup>)",
fill = "Land Cover",
title = "Land Cover Within a 500 Metre Buffer of Each Insect Trap") +
ylab(bquote('Area'~(m^2))) +
# move legend to the bottom and make it less awful
theme(
legend.position = "bottom",
legend.justification = "left",
legend.text = element_text(size = 8),
legend.title = element_text(angle = 90, vjust = 0.5, hjust = 0.5)) +
guides(fill=guide_legend(ncol=5))

Daily Climate Data Aggregations
1. Biweekly
For each biweekly sampling interval, I calculated the mean of (1)
Minimum temperature, (2) Maximum temperature, (3) Mean temperature, (4)
Precipitation, and (5) Max wind gust - all of the relevant variables
available from Environment Canada. This aggregation is similar to Müller
et al. (2024). Additionally, I added total precipitation calculated per
sampling window (Müller et. al, 2024; Seibold et. al, 2019).
# Function to aggregate daily climate data by the date range of our biweekly sampling intervals (ChatGPT code that I debugged. Used ChatGPT because I didn't know the solution to this problem was making a function).
aggregate_climate_data <- function(start_date, end_date) {
climate %>%
filter(date >= start_date & date <= end_date) %>%
summarize(
sampling_mean_min_temp = mean(daily_min_temp, na.rm = TRUE),
sampling_mean_max_temp = mean(daily_max_temp, na.rm = TRUE),
sampling_mean_mean_temp = mean(daily_mean_temp, na.rm = TRUE),
sampling_mean_precipitation = mean(daily_tot_precip, na.rm = TRUE),
sampling_mean_max_gust = mean(daily_max_gust, na.rm = TRUE),
sampling_tot_precipitation = sum(daily_tot_precip, na.rm = TRUE)
)
}
# Apply the function for each date range in insect_counts and add the results as new columns
aggregated_climate <- map2_dfr(insect_counts$range_begin, insect_counts$range_end, aggregate_climate_data)
# Add the aggregated biweekly sampling interval climate data as columns to the insect_counts dataframe
insect_counts <- bind_cols(insect_counts, aggregated_climate)
# Review additions to check that they make sense.
str(insect_counts)
## tibble [6,078 × 21] (S3: tbl_df/tbl/data.frame)
## $ trap_number : Factor w/ 20 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ...
## $ year : num [1:6078] 2018 2018 2018 2018 2018 ...
## $ month : chr [1:6078] "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" ...
## $ range_begin : Date[1:6078], format: "2018-07-06" "2018-07-06" ...
## $ range_end : Date[1:6078], format: "2018-07-20" "2018-07-20" ...
## $ collection_date : Date[1:6078], format: "2018-07-20" "2018-07-20" ...
## $ collect_year : num [1:6078] 2018 2018 2018 2018 2018 ...
## $ collect_month : num [1:6078] 7 7 7 7 7 7 7 7 7 7 ...
## $ collect_day : num [1:6078] 20 20 20 20 20 20 20 20 20 20 ...
## $ group : chr [1:6078] "Hymenoptera" "Diptera" "Lepidoptera" "Hemiptera" ...
## $ biomass : num [1:6078] 1.18 1.83 5.48 0.06 0.56 0.04 2.69 1.76 4.77 0.2 ...
## $ biomass_total : num [1:6078] 9.15 9.15 9.15 9.15 9.15 ...
## $ percent_biomass : num [1:6078] 12.9 20.01 59.87 0.7 6.13 ...
## $ individual_count : num [1:6078] 127 476 248 58 40 97 257 244 269 80 ...
## $ count_total : num [1:6078] 1046 1046 1046 1046 1046 ...
## $ sampling_mean_min_temp : num [1:6078] 12.8 12.8 12.8 12.8 12.8 ...
## $ sampling_mean_max_temp : num [1:6078] 20.7 20.7 20.7 20.7 20.7 ...
## $ sampling_mean_mean_temp : num [1:6078] 16.7 16.7 16.7 16.7 16.7 ...
## $ sampling_mean_precipitation: num [1:6078] 0 0 0 0 0 0 0 0 0 0 ...
## $ sampling_mean_max_gust : num [1:6078] 39 39 39 39 39 39 39 39 39 39 ...
## $ sampling_tot_precipitation : num [1:6078] 0 0 0 0 0 0 0 0 0 0 ...
summary(insect_counts)
## trap_number year month range_begin
## 14 : 414 Min. :2018 Length:6078 Min. :2018-07-06
## 9 : 408 1st Qu.:2019 Class :character 1st Qu.:2019-08-08
## 12 : 408 Median :2020 Mode :character Median :2020-09-01
## 13 : 408 Mean :2020 Mean :2020-11-28
## 4 : 402 3rd Qu.:2022 3rd Qu.:2022-05-01
## 5 : 396 Max. :2023 Max. :2023-10-02
## (Other):3642
## range_end collection_date collect_year collect_month
## Min. :2018-07-20 Min. :2018-07-20 Min. :2018 Min. : 5.000
## 1st Qu.:2019-08-21 1st Qu.:2019-08-21 1st Qu.:2019 1st Qu.: 6.000
## Median :2020-09-14 Median :2020-09-14 Median :2020 Median : 8.000
## Mean :2020-12-10 Mean :2020-12-11 Mean :2020 Mean : 7.694
## 3rd Qu.:2022-05-14 3rd Qu.:2022-05-14 3rd Qu.:2022 3rd Qu.: 9.000
## Max. :2023-10-15 Max. :2023-10-15 Max. :2023 Max. :10.000
##
## collect_day group biomass biomass_total
## Min. : 1.00 Length:6078 Min. : 0.000 Min. : 0.000
## 1st Qu.: 9.00 Class :character 1st Qu.: 0.130 1st Qu.: 1.610
## Median :16.00 Mode :character Median : 0.380 Median : 3.830
## Mean :16.19 Mean : 1.004 Mean : 6.026
## 3rd Qu.:24.00 3rd Qu.: 1.050 3rd Qu.: 8.150
## Max. :31.00 Max. :35.820 Max. :43.460
##
## percent_biomass individual_count count_total sampling_mean_min_temp
## Min. : 0.00 Min. : 0.0 Min. : 0 Min. : 6.50
## 1st Qu.: 4.55 1st Qu.: 32.0 1st Qu.: 659 1st Qu.:10.20
## Median :11.85 Median : 99.0 Median :1258 Median :11.62
## Mean :16.67 Mean : 281.9 Mean :1690 Mean :11.20
## 3rd Qu.:24.94 3rd Qu.: 288.0 3rd Qu.:2173 3rd Qu.:12.81
## Max. :93.78 Max. :8615.0 Max. :9827 Max. :13.55
## NA's :18
## sampling_mean_max_temp sampling_mean_mean_temp sampling_mean_precipitation
## Min. :11.69 Min. : 9.207 Min. :0.00000
## 1st Qu.:16.27 1st Qu.:13.279 1st Qu.:0.04615
## Median :18.19 Median :14.929 Median :0.44286
## Mean :18.14 Mean :14.670 Mean :0.90678
## 3rd Qu.:20.11 3rd Qu.:16.629 3rd Qu.:1.22857
## Max. :23.96 Max. :18.750 Max. :4.54286
##
## sampling_mean_max_gust sampling_tot_precipitation
## Min. :34.40 Min. : 0.00
## 1st Qu.:38.33 1st Qu.: 0.60
## Median :41.30 Median : 6.20
## Mean :41.16 Mean :12.54
## 3rd Qu.:43.33 3rd Qu.:17.20
## Max. :57.00 Max. :63.60
##
Visualization to confirm that climate data aggregated at the level of
biweekly sampling interval is appropriately distributed in time.
# plot of the sampling interval mean minimum temperature grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_mean_min_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Minimum Temperature (°C) of the Sampling Intervals",
title = "The Mean Minimum Temperature of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

# plot of the sampling interval mean of the mean temperature grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_mean_mean_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean of the Mean Temperature (°C) of the Sampling Intervals",
title = "The Mean of the Mean Temperature of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

# plot of the sampling interval mean of the maximum temperature grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_mean_max_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Maximum Temperature (°C) of the Sampling Intervals",
title = "The Mean Maximum Temperature of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

# plot of the sampling interval mean of the mean precipitation grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_mean_precipitation)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Precipitation (mm) of the Sampling Intervals",
title = "The Mean Precipitation of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

# plot of the sampling interval total precipitation grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_tot_precipitation)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Total Precipitation (mm) of the Sampling Intervals",
title = "The Total Precipitation of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

# plot of the sampling interval mean maximum wind gust grouped by months and compared between years
ggplot(
# input data
data = insect_counts,
mapping = aes(x = collect_month, y = sampling_mean_max_gust)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Maximum Wind Gust (km/h) of the Sampling Intervals",
title = "The Mean Maximum Wind Gust of the Sampling Intervals Compared\nMonthly Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# labelling x axis with abbreviated month names
scale_x_continuous(
breaks = unique(insect_counts$collect_month),
labels = month.abb[unique(insect_counts$collect_month)]
) +
# present in a grid of years
facet_wrap(collect_year ~ ., scales = 'free', ncol=4)

2. Yearly
Calculated the mean temperature, number of frost days (temp <0°C),
number of warm days (daily mean temperature >20°C), and also the
precipitation sum for the winter, growing period, and year (Hallmann et
al., 2017; Seibold et al., 2019; Uhler, 2021).
# climate variables summarized by year
climate_calcs_year <- climate %>%
group_by(year) %>%
summarize(year_mean_temp = mean(daily_mean_temp, na.rm = TRUE),
year_days_below_zero = sum(daily_min_temp < 0, na.rm = TRUE),
year_days_above_25 = sum(daily_max_temp > 25, na.rm = TRUE),
year_tot_precip = sum(daily_tot_precip, na.rm = TRUE)) %>%
# remove extra row of NA, na, NA
na.omit()
# change year to be numeric
climate_calcs_year$year <- as.numeric(climate_calcs_year$year)
# Review to make sense of year trends
str(climate_calcs_year)
## tibble [7 × 5] (S3: tbl_df/tbl/data.frame)
## $ year : num [1:7] 2017 2018 2019 2020 2021 ...
## $ year_mean_temp : num [1:7] 10.3 10.9 10.6 10.6 10.4 ...
## $ year_days_below_zero: int [1:7] 17 5 13 7 14 18 7
## $ year_days_above_25 : int [1:7] 3 2 1 9 13 9 8
## $ year_tot_precip : num [1:7] 618 632 628 862 862 ...
summary(climate_calcs_year)
## year year_mean_temp year_days_below_zero year_days_above_25
## Min. :2017 Min. :10.09 Min. : 5.00 Min. : 1.000
## 1st Qu.:2018 1st Qu.:10.36 1st Qu.: 7.00 1st Qu.: 2.500
## Median :2020 Median :10.56 Median :13.00 Median : 8.000
## Mean :2020 Mean :10.55 Mean :11.57 Mean : 6.429
## 3rd Qu.:2022 3rd Qu.:10.77 3rd Qu.:15.50 3rd Qu.: 9.000
## Max. :2023 Max. :10.91 Max. :18.00 Max. :13.000
## year_tot_precip
## Min. :410.2
## 1st Qu.:570.9
## Median :628.0
## Mean :648.0
## 3rd Qu.:747.0
## Max. :861.8
Visualization to confirm that climate data aggregated at the level of
year is appropriately distributed in time.
# change year to be a factor so every year labelled on x-axis
climate_calcs_year$year <- as.factor(climate_calcs_year$year)
# year mean temperature plot
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = year_mean_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Mean Yearly Temperature (ºC)",
title = "Mean Yearly Temperature Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
scale_x_discrete(labels = xlab)

# year days below zero
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = year_days_below_zero)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Number of Days Below 0°C",
title = "Number of Days Below 0°C Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# year days above 25
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = year_days_above_25)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Number of Days Above 25°C",
title = "Number of Days Above 25°C Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# total precipitation
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = year_tot_precip)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Total Precipitation (mm)",
title = "Total Precipitation Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# put year back to numeric for calculations
climate_calcs_year$year <- c("2017", "2018", "2019", "2020", "2021", "2022", "2023")
climate_calcs_year$year <- as.numeric(climate_calcs_year$year)
3. Monthly
Preparation for final aggregations of climate data. Welti et
al. (2022) incorporated monthly means for maximum and minimum
temperature, as well as monthly total precipitation.
# climate variables summarized by month
climate_calcs_month <- climate %>%
group_by(year, month) %>%
summarise(month_tot_precip = sum(daily_tot_precip, na.rm = TRUE),
month_mean_temp = mean(daily_mean_temp, na.rm = TRUE))
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
# change year and month to be numeric
climate_calcs_month$month <- as.numeric(climate_calcs_month$month)
climate_calcs_month$year <- as.numeric(climate_calcs_month$year)
# check out how these are looking
str(climate_calcs_month)
## gropd_df [84 × 4] (S3: grouped_df/tbl_df/tbl/data.frame)
## $ year : num [1:84] 2017 2017 2017 2017 2017 ...
## $ month : num [1:84] 1 2 3 4 5 6 7 8 9 10 ...
## $ month_tot_precip: num [1:84] 37.8 44.2 79.2 37.2 20.4 ...
## $ month_mean_temp : num [1:84] 3.95 4.53 7.02 9.61 12.27 ...
## - attr(*, "groups")= tibble [7 × 2] (S3: tbl_df/tbl/data.frame)
## ..$ year : num [1:7] 2017 2018 2019 2020 2021 ...
## ..$ .rows: list<int> [1:7]
## .. ..$ : int [1:12] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ : int [1:12] 13 14 15 16 17 18 19 20 21 22 ...
## .. ..$ : int [1:12] 25 26 27 28 29 30 31 32 33 34 ...
## .. ..$ : int [1:12] 37 38 39 40 41 42 43 44 45 46 ...
## .. ..$ : int [1:12] 49 50 51 52 53 54 55 56 57 58 ...
## .. ..$ : int [1:12] 61 62 63 64 65 66 67 68 69 70 ...
## .. ..$ : int [1:12] 73 74 75 76 77 78 79 80 81 82 ...
## .. ..@ ptype: int(0)
## ..- attr(*, ".drop")= logi TRUE
summary(climate_calcs_month)
## year month month_tot_precip month_mean_temp
## Min. :2017 Min. : 1.00 Min. : 0.00 Min. : 2.521
## 1st Qu.:2018 1st Qu.: 3.75 1st Qu.: 14.10 1st Qu.: 6.641
## Median :2020 Median : 6.50 Median : 36.40 Median :10.142
## Mean :2020 Mean : 6.50 Mean : 54.00 Mean :10.544
## 3rd Qu.:2022 3rd Qu.: 9.25 3rd Qu.: 80.95 3rd Qu.:14.612
## Max. :2023 Max. :12.00 Max. :240.20 Max. :17.465
## NA's :1
Visualization to confirm that the distribution of these monthly
variables makes sense.
# change month to be a factor so all labelled on x-axis
climate_calcs_month$month <- as.factor(climate_calcs_month$month)
# total precipitation
ggplot(
# input data
data = climate_calcs_month,
mapping = aes(x = month, y = month_tot_precip)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Total Precipitation (mm)",
title = "Total Monthly Precipitation Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# present in a grid of years
facet_wrap(year ~ ., scales = 'free', ncol=4) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb)

# mean temperature
ggplot(
# input data
data = climate_calcs_month,
mapping = aes(x = month, y = month_mean_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Month",
y = "Mean Temperature (°C)",
title = "Mean Monthly Temperature Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
# present in a grid of years
facet_wrap(year ~ ., scales = 'free', ncol= 4) +
# abbreviate month names on x-axis
scale_x_discrete(labels = month.abb)

# change month to be a number again
climate_calcs_month$month <- as.numeric(climate_calcs_month$month)
4. Previous year winter
Previous year winter (November-February) precipitation and
temperature (Hallmann et al., 2017; Seibold et al., 2019).
# lets start with the hardest one first, previous winter, which is nov and dec of previous year and jan, feb of current year (ChatGPT for coding support and debugged by me).
prev_year_winter_precip <- climate_calcs_month %>%
# Filter to include only the months November (11), December (12), January (1), and February (2)
filter(month %in% c(11, 12, 1, 2)) %>%
# Adjust year for Nov and Dec to be counted in the next year's total
mutate(adjusted_year = ifelse(month %in% c(11, 12), year + 1, year)) %>%
# Filter for Adjusted_Year in the range 2018–2023
filter(adjusted_year >= 2018 & adjusted_year <= 2023) %>%
# Group by Adjusted_Year
group_by(adjusted_year) %>%
# Summarize total precipitation for these winter months
summarize(prev_year_winter_precip = sum(month_tot_precip, na.rm = TRUE)) %>%
# Rename Adjusted_Year back to Year for merging
rename(year = adjusted_year)
# Join this to the yearly climate calcs
climate_calcs_year <- climate_calcs_year %>%
left_join(prev_year_winter_precip, by = "year")
# remove working dataframe
rm(prev_year_winter_precip)
### Now previous year winter temp
# Add a column for "prev_year_winter_temp"
prev_year_winter_temp <- climate_calcs_month %>%
# Filter to include only the months November (11), December (12), January (1), and February (2)
filter(month %in% c(11, 12, 1, 2)) %>%
# Adjust year for Nov and Dec to be counted in the next year's total
mutate(adjusted_year = ifelse(month %in% c(11, 12), year + 1, year)) %>%
# Filter for Adjusted_Year in the range 2018–2023
filter(adjusted_year >= 2018 & adjusted_year <= 2023) %>%
# Group by Adjusted_Year
group_by(adjusted_year) %>%
# Summarize mean temperature for these winter months
summarize(prev_year_winter_temp = mean(month_mean_temp, na.rm = TRUE)) %>%
# Rename Adjusted_Year back to Year for merging
rename(year = adjusted_year)
# Join this to the yearly climate calcs
climate_calcs_year <- climate_calcs_year %>%
left_join(prev_year_winter_temp, by = "year")
# remove working dataframe
rm(prev_year_winter_temp)
# check out how this looks
str(climate_calcs_year)
## tibble [7 × 7] (S3: tbl_df/tbl/data.frame)
## $ year : num [1:7] 2017 2018 2019 2020 2021 ...
## $ year_mean_temp : num [1:7] 10.3 10.9 10.6 10.6 10.4 ...
## $ year_days_below_zero : int [1:7] 17 5 13 7 14 18 7
## $ year_days_above_25 : int [1:7] 3 2 1 9 13 9 8
## $ year_tot_precip : num [1:7] 618 632 628 862 862 ...
## $ prev_year_winter_precip: num [1:7] NA 463 438 503 489 ...
## $ prev_year_winter_temp : num [1:7] NA 6.03 6.15 6.45 6.16 ...
summary(climate_calcs_year)
## year year_mean_temp year_days_below_zero year_days_above_25
## Min. :2017 Min. :10.09 Min. : 5.00 Min. : 1.000
## 1st Qu.:2018 1st Qu.:10.36 1st Qu.: 7.00 1st Qu.: 2.500
## Median :2020 Median :10.56 Median :13.00 Median : 8.000
## Mean :2020 Mean :10.55 Mean :11.57 Mean : 6.429
## 3rd Qu.:2022 3rd Qu.:10.77 3rd Qu.:15.50 3rd Qu.: 9.000
## Max. :2023 Max. :10.91 Max. :18.00 Max. :13.000
##
## year_tot_precip prev_year_winter_precip prev_year_winter_temp
## Min. :410.2 Min. :221.0 Min. :5.175
## 1st Qu.:570.9 1st Qu.:444.4 1st Qu.:5.673
## Median :628.0 Median :469.9 Median :6.092
## Mean :648.0 Mean :431.9 Mean :5.922
## 3rd Qu.:747.0 3rd Qu.:485.9 3rd Qu.:6.161
## Max. :861.8 Max. :503.4 Max. :6.453
## NA's :1 NA's :1
Visualization to confirm that the distribution of these previous year
variables make sense.
# change year to be a factor so every year labelled on x-axis
climate_calcs_year$year <- as.factor(climate_calcs_year$year)
# previous year winter precipitation
climate_calcs_year %>%
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = prev_year_winter_precip)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Total Precipitation (mm) of the Previous Year Winter",
title = "Previous Winter Total Precipitation Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# previous year temperature
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = prev_year_winter_temp)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Mean Temperature (°C) of the Previous Year Winter",
title = "Previous Winter Mean Temperature Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# put year back to numeric for calculations
climate_calcs_year$year <- c("2017", "2018", "2019", "2020", "2021", "2022", "2023")
climate_calcs_year$year <- as.numeric(climate_calcs_year$year)
5. Current year growing season
As done by Seibold et al. (2019).
# calculate total precipitation during the growing season of the year
growing_szn_tot_precip <- climate_calcs_month %>%
group_by(year) %>%
filter(month %in% c(3, 4, 5, 6, 7, 8, 9, 10)) %>%
summarize(year_growing_szn_tot_precip = sum(month_tot_precip, na.rm = TRUE)) %>%
# remove extra row of NA, na, NA
na.omit()
# join growing season total precipitation back to monthly climate calcs
climate_calcs_year <- climate_calcs_year %>%
left_join(growing_szn_tot_precip)
# remove working dataframe
rm(growing_szn_tot_precip)
# Add climate variables summarized by year to insect_counts
insect_counts <- insect_counts %>%
left_join(climate_calcs_year)
# check out how this looks
str(climate_calcs_year)
## tibble [7 × 8] (S3: tbl_df/tbl/data.frame)
## $ year : num [1:7] 2017 2018 2019 2020 2021 ...
## $ year_mean_temp : num [1:7] 10.3 10.9 10.6 10.6 10.4 ...
## $ year_days_below_zero : int [1:7] 17 5 13 7 14 18 7
## $ year_days_above_25 : int [1:7] 3 2 1 9 13 9 8
## $ year_tot_precip : num [1:7] 618 632 628 862 862 ...
## $ prev_year_winter_precip : num [1:7] NA 463 438 503 489 ...
## $ prev_year_winter_temp : num [1:7] NA 6.03 6.15 6.45 6.16 ...
## $ year_growing_szn_tot_precip: num [1:7] 283 116 286 297 286 ...
summary(climate_calcs_year)
## year year_mean_temp year_days_below_zero year_days_above_25
## Min. :2017 Min. :10.09 Min. : 5.00 Min. : 1.000
## 1st Qu.:2018 1st Qu.:10.36 1st Qu.: 7.00 1st Qu.: 2.500
## Median :2020 Median :10.56 Median :13.00 Median : 8.000
## Mean :2020 Mean :10.55 Mean :11.57 Mean : 6.429
## 3rd Qu.:2022 3rd Qu.:10.77 3rd Qu.:15.50 3rd Qu.: 9.000
## Max. :2023 Max. :10.91 Max. :18.00 Max. :13.000
##
## year_tot_precip prev_year_winter_precip prev_year_winter_temp
## Min. :410.2 Min. :221.0 Min. :5.175
## 1st Qu.:570.9 1st Qu.:444.4 1st Qu.:5.673
## Median :628.0 Median :469.9 Median :6.092
## Mean :648.0 Mean :431.9 Mean :5.922
## 3rd Qu.:747.0 3rd Qu.:485.9 3rd Qu.:6.161
## Max. :861.8 Max. :503.4 Max. :6.453
## NA's :1 NA's :1
## year_growing_szn_tot_precip
## Min. :115.8
## 1st Qu.:208.4
## Median :282.6
## Mean :240.6
## 3rd Qu.:286.0
## Max. :297.2
##
Visualization to confirm precipitation during the growing season
trends are logical.
# change year to be a factor so every year labelled on x-axis
climate_calcs_year$year <- as.factor(climate_calcs_year$year)
# precipitation during the growing season
ggplot(
# input data
data = climate_calcs_year,
mapping = aes(x = year, y = year_growing_szn_tot_precip)) +
# scatterplot
geom_point() +
# label everything
labs(
x = "Year",
y = "Total Precipitation (mm) During the Growing Season",
title = "Total Precipitation During the Growing Season Compared Between Years"
) +
# improve the look of the x-axis
# angle text vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none")

# put year back to numeric for calculations
climate_calcs_year$year <- c("2017", "2018", "2019", "2020", "2021", "2022", "2023")
climate_calcs_year$year <- as.numeric(climate_calcs_year$year)
Join Data Together into One Dataframe
# join location data to the working dataframe
insect_counts <- insect_counts %>%
left_join(traps)
# add landcover data to insect_counts
insect_counts <- insect_counts %>%
left_join(landcover, join_by(trap_number))
# rearrange data
# make list of column names in order
col_order <- c("trap_number",
"latitude",
"longitude",
"year",
"month",
"range_begin",
"range_end",
"collection_date",
"collect_year",
"collect_month",
"collect_day",
"group",
"biomass",
"biomass_total",
"percent_biomass",
"individual_count",
"count_total",
"sampling_mean_min_temp",
"sampling_mean_max_temp",
"sampling_mean_mean_temp",
"sampling_mean_precipitation",
"sampling_mean_max_gust",
"sampling_tot_precipitation",
"year_mean_temp",
"year_days_below_zero",
"year_days_above_25",
"year_tot_precip",
"prev_year_winter_precip",
"prev_year_winter_temp",
"year_growing_szn_tot_precip",
"buffer",
"total_area",
"exposed_soil",
"grass",
"herb",
"shrub_small_tree",
"deciduous_tree",
"coniferous_tree",
"pavement_packed_gravel",
"road",
"buildings",
"agriculture",
"exposed_bedrock",
"ocean",
"lake",
"aq_vegetation",
"sand_gravel_shore",
"bedrock_shore",
"docks",
"loose_gravel",
"natural",
"anthropogenic")
# rearrange columns
insect_counts <- insect_counts[, col_order]
# Review additions to check that they make sense.
str(insect_counts)
## tibble [24,312 × 52] (S3: tbl_df/tbl/data.frame)
## $ trap_number : Factor w/ 20 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ latitude : chr [1:24312] "48°24′19.8701″" "48°24′19.8701″" "48°24′19.8701″" "48°24′19.8701″" ...
## $ longitude : chr [1:24312] "-123°33′21.3825″" "-123°33′21.3825″" "-123°33′21.3825″" "-123°33′21.3825″" ...
## $ year : num [1:24312] 2018 2018 2018 2018 2018 ...
## $ month : chr [1:24312] "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" "(July 6 - 20)" ...
## $ range_begin : Date[1:24312], format: "2018-07-06" "2018-07-06" ...
## $ range_end : Date[1:24312], format: "2018-07-20" "2018-07-20" ...
## $ collection_date : Date[1:24312], format: "2018-07-20" "2018-07-20" ...
## $ collect_year : num [1:24312] 2018 2018 2018 2018 2018 ...
## $ collect_month : num [1:24312] 7 7 7 7 7 7 7 7 7 7 ...
## $ collect_day : num [1:24312] 20 20 20 20 20 20 20 20 20 20 ...
## $ group : chr [1:24312] "Hymenoptera" "Hymenoptera" "Hymenoptera" "Hymenoptera" ...
## $ biomass : num [1:24312] 1.18 1.18 1.18 1.18 1.83 1.83 1.83 1.83 5.48 5.48 ...
## $ biomass_total : num [1:24312] 9.15 9.15 9.15 9.15 9.15 9.15 9.15 9.15 9.15 9.15 ...
## $ percent_biomass : num [1:24312] 12.9 12.9 12.9 12.9 20 ...
## $ individual_count : num [1:24312] 127 127 127 127 476 476 476 476 248 248 ...
## $ count_total : num [1:24312] 1046 1046 1046 1046 1046 ...
## $ sampling_mean_min_temp : num [1:24312] 12.8 12.8 12.8 12.8 12.8 ...
## $ sampling_mean_max_temp : num [1:24312] 20.7 20.7 20.7 20.7 20.7 ...
## $ sampling_mean_mean_temp : num [1:24312] 16.7 16.7 16.7 16.7 16.7 ...
## $ sampling_mean_precipitation: num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ sampling_mean_max_gust : num [1:24312] 39 39 39 39 39 39 39 39 39 39 ...
## $ sampling_tot_precipitation : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ year_mean_temp : num [1:24312] 10.9 10.9 10.9 10.9 10.9 ...
## $ year_days_below_zero : int [1:24312] 5 5 5 5 5 5 5 5 5 5 ...
## $ year_days_above_25 : int [1:24312] 2 2 2 2 2 2 2 2 2 2 ...
## $ year_tot_precip : num [1:24312] 632 632 632 632 632 ...
## $ prev_year_winter_precip : num [1:24312] 463 463 463 463 463 ...
## $ prev_year_winter_temp : num [1:24312] 6.03 6.03 6.03 6.03 6.03 ...
## $ year_growing_szn_tot_precip: num [1:24312] 116 116 116 116 116 ...
## $ buffer : num [1:24312] 50 100 250 500 50 100 250 500 50 100 ...
## $ total_area : int [1:24312] 7851 31406 196233 784886 7851 31406 196233 784886 7851 31406 ...
## $ exposed_soil : int [1:24312] 10 131 2285 9803 10 131 2285 9803 10 131 ...
## $ grass : int [1:24312] 148 846 12862 51910 148 846 12862 51910 148 846 ...
## $ herb : int [1:24312] 70 218 6043 20399 70 218 6043 20399 70 218 ...
## $ shrub_small_tree : int [1:24312] 190 1105 14603 51117 190 1105 14603 51117 190 1105 ...
## $ deciduous_tree : int [1:24312] 1505 7138 38664 173481 1505 7138 38664 173481 1505 7138 ...
## $ coniferous_tree : int [1:24312] 5928 21521 112476 446210 5928 21521 112476 446210 5928 21521 ...
## $ pavement_packed_gravel : int [1:24312] 0 11 2036 9968 0 11 2036 9968 0 11 ...
## $ road : int [1:24312] 0 0 4352 12024 0 0 4352 12024 0 0 ...
## $ buildings : int [1:24312] 0 436 2837 8624 0 436 2837 8624 0 436 ...
## $ agriculture : int [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ exposed_bedrock : int [1:24312] 0 0 75 366 0 0 75 366 0 0 ...
## $ ocean : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ lake : num [1:24312] 0 0 0 984 0 0 0 984 0 0 ...
## $ aq_vegetation : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ sand_gravel_shore : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ bedrock_shore : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ docks : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ loose_gravel : num [1:24312] 0 0 0 0 0 0 0 0 0 0 ...
## $ natural : int [1:24312] 7851 30959 187008 754270 7851 30959 187008 754270 7851 30959 ...
## $ anthropogenic : int [1:24312] 0 447 9225 30616 0 447 9225 30616 0 447 ...
summary(insect_counts)
## trap_number latitude longitude year
## 14 : 1656 Length:24312 Length:24312 Min. :2018
## 9 : 1632 Class :character Class :character 1st Qu.:2019
## 12 : 1632 Mode :character Mode :character Median :2020
## 13 : 1632 Mean :2020
## 4 : 1608 3rd Qu.:2022
## 5 : 1584 Max. :2023
## (Other):14568
## month range_begin range_end
## Length:24312 Min. :2018-07-06 Min. :2018-07-20
## Class :character 1st Qu.:2019-08-08 1st Qu.:2019-08-21
## Mode :character Median :2020-09-01 Median :2020-09-14
## Mean :2020-11-28 Mean :2020-12-10
## 3rd Qu.:2022-05-01 3rd Qu.:2022-05-14
## Max. :2023-10-02 Max. :2023-10-15
##
## collection_date collect_year collect_month collect_day
## Min. :2018-07-20 Min. :2018 Min. : 5.000 Min. : 1.00
## 1st Qu.:2019-08-21 1st Qu.:2019 1st Qu.: 6.000 1st Qu.: 9.00
## Median :2020-09-14 Median :2020 Median : 8.000 Median :16.00
## Mean :2020-12-11 Mean :2020 Mean : 7.694 Mean :16.19
## 3rd Qu.:2022-05-14 3rd Qu.:2022 3rd Qu.: 9.000 3rd Qu.:24.00
## Max. :2023-10-15 Max. :2023 Max. :10.000 Max. :31.00
##
## group biomass biomass_total percent_biomass
## Length:24312 Min. : 0.000 Min. : 0.000 Min. : 0.00
## Class :character 1st Qu.: 0.130 1st Qu.: 1.610 1st Qu.: 4.55
## Mode :character Median : 0.380 Median : 3.830 Median :11.85
## Mean : 1.004 Mean : 6.026 Mean :16.67
## 3rd Qu.: 1.050 3rd Qu.: 8.150 3rd Qu.:24.94
## Max. :35.820 Max. :43.460 Max. :93.78
## NA's :72
## individual_count count_total sampling_mean_min_temp sampling_mean_max_temp
## Min. : 0.0 Min. : 0 Min. : 6.50 Min. :11.69
## 1st Qu.: 32.0 1st Qu.: 659 1st Qu.:10.20 1st Qu.:16.27
## Median : 99.0 Median :1258 Median :11.62 Median :18.19
## Mean : 281.9 Mean :1690 Mean :11.20 Mean :18.14
## 3rd Qu.: 288.0 3rd Qu.:2173 3rd Qu.:12.81 3rd Qu.:20.11
## Max. :8615.0 Max. :9827 Max. :13.55 Max. :23.96
##
## sampling_mean_mean_temp sampling_mean_precipitation sampling_mean_max_gust
## Min. : 9.207 Min. :0.00000 Min. :34.40
## 1st Qu.:13.279 1st Qu.:0.04615 1st Qu.:38.33
## Median :14.929 Median :0.44286 Median :41.30
## Mean :14.670 Mean :0.90678 Mean :41.16
## 3rd Qu.:16.629 3rd Qu.:1.22857 3rd Qu.:43.33
## Max. :18.750 Max. :4.54286 Max. :57.00
##
## sampling_tot_precipitation year_mean_temp year_days_below_zero
## Min. : 0.00 Min. :10.09 Min. : 5.00
## 1st Qu.: 0.60 1st Qu.:10.44 1st Qu.: 7.00
## Median : 6.20 Median :10.56 Median :13.00
## Mean :12.54 Mean :10.58 Mean :10.88
## 3rd Qu.:17.20 3rd Qu.:10.65 3rd Qu.:14.00
## Max. :63.60 Max. :10.91 Max. :18.00
##
## year_days_above_25 year_tot_precip prev_year_winter_precip
## Min. : 1.000 Min. :410.2 Min. :221.0
## 1st Qu.: 2.000 1st Qu.:524.0 1st Qu.:438.0
## Median : 9.000 Median :632.2 Median :476.4
## Mean : 6.934 Mean :681.2 Mean :441.9
## 3rd Qu.: 9.000 3rd Qu.:861.8 3rd Qu.:489.0
## Max. :13.000 Max. :861.8 Max. :503.4
##
## prev_year_winter_temp year_growing_szn_tot_precip buffer
## Min. :5.175 Min. :115.8 Min. : 50.0
## 1st Qu.:5.554 1st Qu.:266.0 1st Qu.: 87.5
## Median :6.154 Median :286.0 Median :175.0
## Mean :6.006 Mean :249.1 Mean :225.0
## 3rd Qu.:6.164 3rd Qu.:286.0 3rd Qu.:312.5
## Max. :6.453 Max. :297.2 Max. :500.0
##
## total_area exposed_soil grass herb
## Min. : 7845 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 25500 1st Qu.: 97 1st Qu.: 980 1st Qu.: 541
## Median :113800 Median : 794 Median : 6273 Median : 3037
## Mean :255088 Mean : 3642 Mean : 20540 Mean : 9422
## 3rd Qu.:343394 3rd Qu.: 5581 3rd Qu.: 32180 3rd Qu.:14363
## Max. :784915 Max. :25191 Max. :146646 Max. :46590
##
## shrub_small_tree deciduous_tree coniferous_tree pavement_packed_gravel
## Min. : 43 Min. : 804 Min. : 1154 Min. : 0
## 1st Qu.: 1050 1st Qu.: 3378 1st Qu.: 5697 1st Qu.: 130
## Median : 5244 Median : 18979 Median : 26842 Median : 841
## Mean :16029 Mean : 48624 Mean :105383 Mean : 3731
## 3rd Qu.:23167 3rd Qu.: 71200 3rd Qu.:131309 3rd Qu.: 5480
## Max. :70352 Max. :230305 Max. :549144 Max. :34899
##
## road buildings agriculture exposed_bedrock
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 62 1st Qu.: 0 1st Qu.: 0
## Median : 892 Median : 617 Median : 0 Median : 0
## Mean : 3434 Mean : 3137 Mean : 27881 Mean : 533
## 3rd Qu.: 4591 3rd Qu.: 3775 3rd Qu.: 11852 3rd Qu.: 105
## Max. :25632 Max. :37996 Max. :345214 Max. :22785
##
## ocean lake aq_vegetation sand_gravel_shore
## Min. : 0 Min. : 0.0 Min. : 0.00 Min. : 0
## 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 0.00 1st Qu.: 0
## Median : 0 Median : 0.0 Median : 0.00 Median : 0
## Mean : 9969 Mean : 930.8 Mean : 43.33 Mean : 1474
## 3rd Qu.: 0 3rd Qu.: 900.0 3rd Qu.: 1.00 3rd Qu.: 0
## Max. :218199 Max. :28164.0 Max. :906.00 Max. :26665
##
## bedrock_shore docks loose_gravel natural
## Min. : 0.0 Min. : 0.000 Min. : 0.00 Min. : 3020
## 1st Qu.: 0.0 1st Qu.: 0.000 1st Qu.: 0.00 1st Qu.: 13826
## Median : 0.0 Median : 0.000 Median : 0.00 Median : 63030
## Mean : 297.6 Mean : 2.113 Mean : 16.48 Mean :216903
## 3rd Qu.: 0.0 3rd Qu.: 0.000 3rd Qu.: 0.00 3rd Qu.:246791
## Max. :15989.0 Max. :85.000 Max. :1012.00 Max. :777399
##
## anthropogenic
## Min. : 0
## 1st Qu.: 946
## Median : 7708
## Mean : 38186
## 3rd Qu.: 33490
## Max. :383525
##
References
Blüthgen, N., Staab, M., Achury, R., & Weisser, W. W. (2022).
Unravelling insect declines: Can space replace time? Biology
Letters, 18(4), 20210666. https://doi.org/10.1098/rsbl.2021.0666
Gebert, F., Bollmann, K., Schuwirth, N., Duelli, P., Weber, D., &
Obrist, M. K. (2024). Similar temporal patterns in insect richness,
abundance and biomass across major habitat types. Insect
Conservation and Diversity, 17(1), 139–154. https://doi.org/10.1111/icad.12700
Innes, F. (2024). Conservation of biodiversity: Insect species
abundance and biomass trends in a rural municipality. Unpublished
report. Directed Studies/BIOL490B, University of Victoria.
Hallmann, C. A., Sorg, M., Jongejans, E., Siepel, H., Hofland, N.,
Schwan, H., Stenmans, W., Müller, A., Sumser, H., Hörren, T., Goulson,
D., & Kroon, H. de. (2017). More than 75 percent decline over 27
years in total flying insect biomass in protected areas. PLOS
ONE, 12(10), e0185809. https://doi.org/10.1371/journal.pone.0185809
Hallmann, C. A., Zeegers, T., van Klink, R., Vermeulen, R., van
Wielink, P., Spijkers, H., van Deijk, J., van Steenis, W., &
Jongejans, E. (2020). Declining abundance of beetles, moths and
caddisflies in the Netherlands. Insect Conservation and
Diversity, 13(2), 127–139. https://doi.org/10.1111/icad.12377
Li, H., & Wilkins, K. T. (2022). Predator-prey relationship
between urban bats and insects Impacted by both artificial light at
night and spatial clutter. Biology, 11(6), Article 6.
https://doi.org/10.3390/biology11060829
Müller, J., Hothorn, T., Yuan, Y., Seibold, S., Mitesser, O.,
Rothacher, J., Freund, J., Wild, C., Wolz, M., & Menzel, A. (2024).
Weather explains the decline and rise of insect biomass over 34 years.
Nature, 628(8007), 349–354. https://doi.org/10.1038/s41586-023-06402-z
Nikel, K. (2019). Estimating flying insect biomass and its
spatiotemporal distribution in Metchosin: Biodiversity on southern
Vancouver Island. Unpublished report. Honours Thesis, University of
Victoria.
Seibold, S., Gossner, M. M., Simons, N. K., Blüthgen, N., Müller, J.,
Ambarlı, D., Ammer, C., Bauhus, J., Fischer, M., Habel, J. C.,
Linsenmair, K. E., Nauss, T., Penone, C., Prati, D., Schall, P.,
Schulze, E.-D., Vogt, J., Wöllauer, S., & Weisser, W. W. (2019).
Arthropod decline in grasslands and forests is associated with
landscape-level drivers. Nature, 574(7780), 671–674.
https://doi.org/10.1038/s41586-019-1684-3
Svenningsen, C. S., Peters, B., Bowler, D. E., Dunn, R. R., Bonn, A.,
& Tøttrup, A. P. (2024). Insect biomass shows a stronger decrease
than species richness along urban gradients. Insect Conservation and
Diversity, 17(2), 182–188. https://doi.org/10.1111/icad.12694
Uhler, J., Redlich, S., Zhang, J., Hothorn, T., Tobisch, C., Ewald,
J., Thorn, S., Seibold, S., Mitesser, O., Morinière, J., Bozicevic, V.,
Benjamin, C. S., Englmeier, J., Fricke, U., Ganuza, C., Haensel, M.,
Riebl, R., Rojas-Botero, S., Rummler, T., … Müller, J. (2021).
Relationship of insect biomass and richness with land use along a
climate gradient. Nature Communications, 12(1), 5946.
https://doi.org/10.1038/s41467-021-26181-3
Uphus, L., Uhler, J., Tobisch, C., Rojas-Botero, S., Lüpke, M.,
Benjamin, C., Englmeier, J., Fricke, U., Ganuza, C., Haensel, M.,
Redlich, S., Zhang, J., Müller, J., & Menzel, A. (2023). Earlier and
more uniform spring green-up linked to lower insect richness and biomass
in temperate forests. Communications Biology, 6(1),
1–11. https://doi.org/10.1038/s42003-023-05422-9
Welti, E. A. R., Zajicek, P., Frenzel, M., Ayasse, M., Bornholdt, T.,
Buse, J., Classen, A., Dziock, F., Engelmann, R. A., Englmeier, J.,
Fellendorf, M., Förschler, M. I., Fricke, U., Ganuza, C., Hippke, M.,
Hoenselaar, G., Kaus-Thiel, A., Kerner, J., Kilian, D., … Haase, P.
(2022). Temperature drives variation in flying insect biomass across a
German malaise trap network. Insect Conservation and Diversity,
15(2), 168–180. https://doi.org/10.1111/icad.12555